Different estimation methods of the modified Kies Topp-Leone model with applications and quantile regression

This paper introduces the modified Kies Topp-Leone (MKTL) distribution for modeling data on the (0, 1) or [0, 1] interval. The density and hazard rate functions take a variety of shapes, making the MKTL distribution suitable for modeling data with different characteristics on the unit interval. Twelve estimation methods are used to estimate the distribution parameters, and Monte Carlo simulation experiments are carried out to assess their performance. The simulation results suggest that the maximum likelihood method performs best. The usefulness of the new distribution is illustrated with three data sets, and its performance is compared with that of competing models. The findings indicate that the MKTL distribution fits better than the other candidate models. A quantile regression model developed from the new distribution offers a competitive fit relative to existing regression models.


Introduction
Distributional assumptions are inherent in the choice of an appropriate parametric model for analysis. Consequently, the choice of a suitable parametric model depends on the underlying distribution governing the data-generating process. Identifying the appropriate distribution before fitting any parametric model to data is therefore not only a requirement but central to making correct inferences. Although innumerable distributions are available for any analysis, it is essential to choose one that provides the best fit with minimal loss of information. This has led researchers to extend existing distributions with the primary goal of improving their performance.
The Topp-Leone (TL) distribution (see [1]) is one of the oldest distributions that researchers have recently modified to enhance its suitability for modeling data. The TL distribution with shape parameter η > 0 has cumulative distribution function (CDF) and probability density function (PDF) defined respectively as

$$G(x; \eta) = \left[1 - (1 - x)^2\right]^{\eta}, \quad 0 < x < 1, \tag{1}$$

and

$$g(x; \eta) = 2\eta (1 - x)\left[1 - (1 - x)^2\right]^{\eta - 1}. \tag{2}$$

Some of the extensions of the TL distribution in the literature are: the new extended TL distribution by [2], cosine TL Weibull distribution by [3], tangent TL Weibull distribution by [4], modified Kies inverted TL distribution by [5], Fréchet TL Kumaraswamy distribution by [6], TL Weibull distribution by [7], type II power TL normal distribution by [8], sine TL inverse Lomax distribution by [9], Weibull TL generated generalized half-normal distribution by [10], inverted TL distribution by [11], TL Gompertz distribution by [12], new power TL-G family by [13], Type II generalized TL-G family by [14], Type I half-logistic TL distribution by [15] and Type II TL-G family by [16]; for more information see [17-24].

In this study, a new extension of the TL distribution, called the modified Kies TL (MKTL) distribution, is developed using the modified Kies (MK) family of distributions proposed by [25]. The CDF and PDF of the MK family of distributions are respectively given by

$$F(x; \Psi) = 1 - \exp\left\{-\left[\frac{G(x; \Psi)}{1 - G(x; \Psi)}\right]^{\beta}\right\} \tag{3}$$

and

$$f(x; \Psi) = \frac{\beta\, g(x; \Psi)\, G(x; \Psi)^{\beta - 1}}{\left[1 - G(x; \Psi)\right]^{\beta + 1}} \exp\left\{-\left[\frac{G(x; \Psi)}{1 - G(x; \Psi)}\right]^{\beta}\right\}, \tag{4}$$

where g(x; Ψ) and G(x; Ψ) are the parent PDF and CDF of the baseline distribution with a set of parameters Ψ, and β > 0 is the shape parameter of the family. Furthermore, [25] employed the binomial and exponential series to rewrite the PDF as a linear combination of exponentiated-family densities as follows:

$$f(x; \Psi) = g(x; \Psi)\,\cdots \tag{5}$$

Our motivations for the formulation of the MKTL distribution are:
• Propose a new distribution capable of fitting data on the unit interval with various traits and offering an optimal fit with the least loss of information.
• Study the behavior of the estimates of the parameters of the MKTL distribution under twelve different estimation procedures in order to identify the most suitable estimation method.
• Examine the utility of the MKTL distribution using data sets with different characteristics and compare its performance with that of other competing distributions.
• Formulate a new quantile regression for modeling the relationship between an endogenous variable defined on the (0, 1) interval and a set of exogenous variables.
The rest of the paper is organized as follows. The formulation of the MKTL distribution is detailed in Section 2. The statistical properties of the MKTL distribution are presented in Section 3. The estimation procedures used to estimate the parameters of the model are described in Section 4. The simulation experiments performed to examine how well the estimation methods estimate the parameters are presented in Section 5. The usefulness of the MKTL distribution for fitting data is demonstrated in Section 6. The MKTL quantile regression model and its applications are presented in Section 7. Concluding remarks are given in Section 8.

Formulation of the MKTL distribution
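In this section, we construct the modified Kies Topp-Leone (MKTL) distribution by inserting (1) and (2) in (3) and (4). The CDF, PDF, reliability function (RF) and hazard rate function (HRF) of the MKTL distribution are

$$F(x; \eta, \beta) = 1 - \exp\left\{-\left[\left(1 - (1 - x)^2\right)^{-\eta} - 1\right]^{-\beta}\right\}, \quad 0 < x < 1, \; \eta, \beta > 0, \tag{6}$$

$$f(x; \eta, \beta) = \frac{2\beta\eta (1 - x)\left[1 - (1 - x)^2\right]^{\eta\beta - 1}}{\left\{1 - \left[1 - (1 - x)^2\right]^{\eta}\right\}^{\beta + 1}} \exp\left\{-\left[\left(1 - (1 - x)^2\right)^{-\eta} - 1\right]^{-\beta}\right\}, \tag{7}$$

$$R(x; \eta, \beta) = \exp\left\{-\left[\left(1 - (1 - x)^2\right)^{-\eta} - 1\right]^{-\beta}\right\},$$

$$h(x; \eta, \beta) = \frac{2\beta\eta (1 - x)\left[1 - (1 - x)^2\right]^{\eta\beta - 1}}{\left\{1 - \left[1 - (1 - x)^2\right]^{\eta}\right\}^{\beta + 1}}.$$

The reversed HRF (RHRF), cumulative HRF (CHRF), odds ratio (OR), failure rate average (FRA) and Mills ratio (MR) of the MKTL distribution follow from their usual definitions in terms of f, F and R:

$$\tau(x; \eta, \beta) = \frac{f(x; \eta, \beta)}{F(x; \eta, \beta)}, \qquad H(x; \eta, \beta) = -\log R(x; \eta, \beta) = \left[\left(1 - (1 - x)^2\right)^{-\eta} - 1\right]^{-\beta},$$

$$\mathrm{OR}(x) = \frac{F(x; \eta, \beta)}{R(x; \eta, \beta)}, \qquad \mathrm{FRA}(x) = \frac{H(x; \eta, \beta)}{x}, \qquad \mathrm{MR}(x) = \frac{R(x; \eta, \beta)}{f(x; \eta, \beta)}.$$

Figs 1 and 2 show that the PDF of the MKTL distribution can be decreasing, right-skewed, left-skewed or unimodal, while the HRF can be bathtub-shaped, increasing or J-shaped.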
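The functions of this section can be evaluated numerically with a few lines of code. The following is a minimal Python sketch rather than the authors' implementation (the paper's computations use R). The function names are hypothetical, and the sketch assumes the MK construction with cumulative hazard H(x) = [G(x)/(1 − G(x))]^β and the TL baseline G(x) = [1 − (1 − x)^2]^η.

```python
import math


def tl_cdf(x, eta):
    """Topp-Leone baseline CDF G(x; eta) = [1 - (1 - x)^2]^eta for 0 < x < 1."""
    return (1.0 - (1.0 - x) ** 2) ** eta


def mktl_chrf(x, eta, beta):
    """Cumulative hazard H(x) = [G / (1 - G)]^beta, so that R(x) = exp(-H(x))."""
    g = tl_cdf(x, eta)
    return (g / (1.0 - g)) ** beta


def mktl_cdf(x, eta, beta):
    """CDF F(x) = 1 - exp(-H(x)); expm1 keeps precision when H(x) is small."""
    return -math.expm1(-mktl_chrf(x, eta, beta))


def mktl_sf(x, eta, beta):
    """Reliability function R(x) = exp(-H(x))."""
    return math.exp(-mktl_chrf(x, eta, beta))


def mktl_hrf(x, eta, beta):
    """Hazard rate h(x) = f(x) / R(x), written out in closed form."""
    t = 1.0 - (1.0 - x) ** 2
    return (2.0 * beta * eta * (1.0 - x) * t ** (eta * beta - 1.0)
            / (1.0 - t ** eta) ** (beta + 1.0))


def mktl_pdf(x, eta, beta):
    """PDF f(x) = h(x) * R(x)."""
    return mktl_hrf(x, eta, beta) * mktl_sf(x, eta, beta)
```

For example, `mktl_cdf(0.5, 2.0, 1.5)` evaluates the CDF at x = 0.5 for η = 2 and β = 1.5. The sketch does not guard against x so close to 1 that G(x) rounds to exactly 1 in floating point.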

Statistical properties
The essential mathematical characteristics of the MKTL distribution are addressed in this section. The quantile function, the median, moments, and incomplete and conditional moments are derived.

Quantile function
The u-th quantile of the MKTL distribution, denoted by $x_u$, is obtained by inverting the CDF in Eq (6):

$$x_u = 1 - \sqrt{1 - \left\{1 + \left[-\log(1 - u)\right]^{-1/\beta}\right\}^{-1/\eta}}, \quad 0 < u < 1. \tag{8}$$

To determine the median of the MKTL distribution, we substitute u = 0.5 into Eq (8):

$$x_{0.5} = 1 - \sqrt{1 - \left\{1 + \left[\log 2\right]^{-1/\beta}\right\}^{-1/\eta}}.$$

Similarly, substituting u = 0.25, 0.5 and 0.75 into Eq (8) gives the first (Q1), second (Q2) and third (Q3) quartiles. Table 1 shows some numerical values of the quantiles of the MKTL distribution.
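Because the quantile function has a closed form, quantiles and random samples are cheap to compute. The sketch below is illustrative only: the function names are hypothetical, and the quantile is evaluated on the log scale (an implementation choice of this sketch, not something stated in the paper) so that very small u does not overflow.

```python
import math
import random


def mktl_quantile(u, eta, beta):
    """u-th quantile of MKTL(eta, beta) for 0 < u < 1, computed on the log scale."""
    t = -math.log(-math.log1p(-u)) / beta                # log of [-log(1 - u)]^(-1/beta)
    log_a = max(t, 0.0) + math.log1p(math.exp(-abs(t)))  # log(1 + e^t) without overflow
    return 1.0 - math.sqrt(-math.expm1(-log_a / eta))    # 1 - sqrt(1 - a^(-1/eta))


def mktl_median(eta, beta):
    """Median, i.e. the quantile at u = 0.5."""
    return mktl_quantile(0.5, eta, beta)


def mktl_sample(n, eta, beta, seed=None):
    """Inverse-transform sampling: X = Q(U) with U uniform on (0, 1)."""
    rng = random.Random(seed)
    draws = []
    while len(draws) < n:
        u = rng.random()
        if u > 0.0:  # random() can return exactly 0.0, which the quantile cannot take
            draws.append(mktl_quantile(u, eta, beta))
    return draws
```

Calling `mktl_quantile(u, eta, beta)` with u = 0.25, 0.5 and 0.75 gives Q1, Q2 and Q3 for the chosen parameter values.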

Moments and moment generating function
Suppose that the random variable X follows the MKTL distribution. The w-th moment of X, $\mu'_w = E(X^w) = \int_0^1 x^w f(x; \eta, \beta)\,dx$, can be calculated by inserting (1) and (2) in (5), which gives Eq (9). Applying the binomial expansion to Eq (9) then yields the series expression for the w-th moment of the MKTL distribution. The p-th incomplete moment of X, $\int_0^t x^p f(x; \eta, \beta)\,dx$, is calculated in the same way. Table 2 shows the numerical values of the moments $\mu'_1$, $\mu'_2$, $\mu'_3$ and $\mu'_4$, together with the variance (σ²), standard deviation (σ), coefficient of skewness (CS), coefficient of kurtosis (CK) and coefficient of variation (CV) of the MKTL distribution.
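The moments can also be approximated by direct numerical integration of the density, which is a convenient check on any series expression. The sketch below uses a plain midpoint rule and hypothetical function names. It assumes the usual moment-based definitions of CS and CK (standardized third and fourth central moments), and it is only reliable when ηβ ≥ 1, since the density is unbounded at 0 otherwise.

```python
import math


def mktl_pdf(x, eta, beta):
    """MKTL density for 0 < x < 1."""
    t = 1.0 - (1.0 - x) ** 2
    g = t ** eta
    hazard = (2.0 * beta * eta * (1.0 - x) * t ** (eta * beta - 1.0)
              / (1.0 - g) ** (beta + 1.0))
    return hazard * math.exp(-((g / (1.0 - g)) ** beta))


def raw_moment(w, eta, beta, steps=20000):
    """Midpoint-rule approximation of E(X^w), the integral of x^w f(x) over (0, 1)."""
    h = 1.0 / steps
    return h * sum(((i + 0.5) * h) ** w * mktl_pdf((i + 0.5) * h, eta, beta)
                   for i in range(steps))


def summary(eta, beta):
    """Mean, variance, SD, skewness (CS), kurtosis (CK) and CV from the raw moments."""
    m1, m2, m3, m4 = (raw_moment(w, eta, beta) for w in (1, 2, 3, 4))
    var = m2 - m1 ** 2
    sd = math.sqrt(var)
    cs = (m3 - 3.0 * m1 * m2 + 2.0 * m1 ** 3) / sd ** 3
    ck = (m4 - 4.0 * m1 * m3 + 6.0 * m1 ** 2 * m2 - 3.0 * m1 ** 4) / sd ** 4
    return {"mean": m1, "variance": var, "sd": sd, "CS": cs, "CK": ck, "CV": sd / m1}
```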

Order statistics
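Suppose that $X_1, X_2, \ldots, X_s$ is a random sample of size s from the MKTL distribution with CDF (6) and PDF (7), and let $X_{(1)}, X_{(2)}, \ldots, X_{(s)}$ be the corresponding order statistics. The PDF of the m-th order statistic is

$$f_{(m)}(x) = \frac{s!}{(m - 1)!\,(s - m)!}\, f(x)\left[F(x)\right]^{m - 1}\left[1 - F(x)\right]^{s - m}. \tag{11}$$

Inserting (6) and (7) in (11), the PDF of $X_{(m)}$ for the MKTL distribution is

$$f_{(m)}(x) = \frac{s!}{(m - 1)!\,(s - m)!}\, f(x; \eta, \beta)\left[1 - e^{-\Delta(x)}\right]^{m - 1} e^{-(s - m)\Delta(x)}, \tag{12}$$

where $\Delta(x) = \left[\left(1 - (1 - x)^2\right)^{-\eta} - 1\right]^{-\beta}$ and f(x; η, β) is the PDF in (7). Setting m = 1 and m = s in (12) gives the PDFs of the smallest and largest order statistics of the MKTL distribution:

$$f_{(1)}(x) = s\, f(x; \eta, \beta)\, e^{-(s - 1)\Delta(x)}$$

and

$$f_{(s)}(x) = s\, f(x; \eta, \beta)\left[1 - e^{-\Delta(x)}\right]^{s - 1}.$$

Estimation methods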

Method of maximum likelihood
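Maximum likelihood estimation (MLE) (see [26,27]) estimates η and β by maximizing the log-likelihood function. For a sample $x_1, \ldots, x_n$, the log-likelihood function obtained from the PDF in (7) is

$$\ell(\eta, \beta) = n\log(2\beta\eta) + \sum_{i=1}^{n}\log(1 - x_i) + (\eta\beta - 1)\sum_{i=1}^{n}\log\left[1 - (1 - x_i)^2\right] - (\beta + 1)\sum_{i=1}^{n}\log\left\{1 - \left[1 - (1 - x_i)^2\right]^{\eta}\right\} - \sum_{i=1}^{n}\left[\left(1 - (1 - x_i)^2\right)^{-\eta} - 1\right]^{-\beta}.$$

The estimates of η and β are obtained by solving the two simultaneous equations ∂ℓ/∂η = 0 and ∂ℓ/∂β = 0. This requires numerical techniques, for which we use the R software.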
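As a rough illustration of the numerical step, the sketch below maximizes the log-likelihood with a simple derivative-free compass (coordinate pattern) search over (log η, log β). This optimizer is a stand-in chosen to keep the example dependency-free; it is not the routine used in the paper, which relies on R. All function names are hypothetical.

```python
import math


def mktl_negloglik(log_params, data):
    """Negative MKTL log-likelihood, parameterized by (log eta, log beta)."""
    try:
        eta, beta = math.exp(log_params[0]), math.exp(log_params[1])
        total = len(data) * math.log(2.0 * beta * eta)
        for x in data:
            log_t = math.log1p(-(1.0 - x) ** 2)                   # log[1 - (1 - x)^2]
            log_one_minus_g = math.log(-math.expm1(eta * log_t))  # log(1 - G(x))
            cum_hazard = math.exp(beta * (eta * log_t - log_one_minus_g))  # [G/(1-G)]^beta
            total += (math.log(1.0 - x) + (eta * beta - 1.0) * log_t
                      - (beta + 1.0) * log_one_minus_g - cum_hazard)
        return -total
    except (OverflowError, ValueError):
        return math.inf  # numerically infeasible parameter values count as very poor fits


def compass_search(fn, start, step=0.5, tol=1e-6, max_iter=5000):
    """Derivative-free minimizer: probe +/- step on each axis, halve the step when stuck."""
    point, best = list(start), fn(start)
    for _ in range(max_iter):
        if step < tol:
            break
        improved = False
        for k in range(len(point)):
            for delta in (step, -step):
                trial = list(point)
                trial[k] += delta
                value = fn(trial)
                if value < best:
                    point, best, improved = trial, value, True
        if not improved:
            step /= 2.0
    return point, best


def mktl_mle(data):
    """Return (eta_hat, beta_hat, maximized log-likelihood), starting from eta = beta = 1."""
    point, value = compass_search(lambda p: mktl_negloglik(p, data), [0.0, 0.0])
    return math.exp(point[0]), math.exp(point[1]), -value
```

Working on the log scale keeps both parameters positive without explicit constraints. In practice a quasi-Newton routine would normally replace the compass search.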

Method of Anderson-Darling
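The Anderson-Darling estimator (ADE) was introduced by [28]. The estimates of η and β are obtained by minimizing

$$A(\eta, \beta) = -n - \frac{1}{n}\sum_{i=1}^{n}(2i - 1)\left[\log F(x_{(i)}; \eta, \beta) + \log S(x_{(n+1-i)}; \eta, \beta)\right],$$

where $x_{(1)} \le \cdots \le x_{(n)}$ are the ordered observations and S = 1 − F is the reliability function. The partial derivatives of A(η, β) with respect to η and β involve the auxiliary quantities defined in Eqs (15), (16) and (17), and the estimates are found by setting these derivatives to zero and solving numerically.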

Method of maximum product of spacings
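The maximum product of spacings (MPS) method (for more about MPS see [30]) estimates η and β by maximizing

$$\delta(\eta, \beta) = \frac{1}{n + 1}\sum_{i=1}^{n+1}\log D_i, \qquad D_i = F(x_{(i)}; \eta, \beta) - F(x_{(i-1)}; \eta, \beta),$$

with $F(x_{(0)}) = 0$ and $F(x_{(n+1)}) = 1$. The partial derivatives with respect to η and β are obtained using Eqs (15), (16) and (17), and solving ∂δ/∂η = 0 and ∂δ/∂β = 0 gives the estimates of η and β. This is done numerically using the R software.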
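The MPS criterion is easy to evaluate once the CDF is available. The sketch below (hypothetical function names, standard form of the mean log-spacing assumed) returns the objective for given parameter values; any numerical optimizer can then be used to maximize it.

```python
import math


def mktl_cdf(x, eta, beta):
    """MKTL CDF for 0 < x < 1."""
    g = (1.0 - (1.0 - x) ** 2) ** eta
    return -math.expm1(-((g / (1.0 - g)) ** beta))


def mps_objective(eta, beta, data):
    """Mean log-spacing (1/(n+1)) * sum(log D_i); the MPS estimate maximizes it."""
    cdf_values = [0.0] + [mktl_cdf(x, eta, beta) for x in sorted(data)] + [1.0]
    total = 0.0
    for lower, upper in zip(cdf_values, cdf_values[1:]):
        spacing = upper - lower
        if spacing <= 0.0:  # tied or numerically indistinguishable observations
            return -math.inf
        total += math.log(spacing)
    return total / (len(data) + 1)
```

The objective can never exceed −log(n + 1), the value attained when all spacings are equal. Returning −∞ for a zero spacing is a choice of this sketch; tied observations need separate treatment in practice.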

Methods of least squares
The least squares estimator (LSE) (see [31]) is obtained by minimizing

$$V(\eta, \beta) = \sum_{i=1}^{n}\left[F(x_{(i)}; \eta, \beta) - \frac{i}{n + 1}\right]^2.$$

The partial derivatives with respect to η and β are obtained using Eqs (15), (16) and (17). Solving the two simultaneous equations ∂V/∂η = 0 and ∂V/∂β = 0 gives the estimates of η and β, which requires numerical techniques; we use the R software.

Method of right-tail Anderson-Darling
The right-tail Anderson-Darling estimator (RADE) was provided by [28]. The estimates of η and β are obtained by minimizing

$$R(\eta, \beta) = \frac{n}{2} - 2\sum_{i=1}^{n}F(x_{(i)}; \eta, \beta) - \frac{1}{n}\sum_{i=1}^{n}(2i - 1)\log S(x_{(n+1-i)}; \eta, \beta).$$

The partial derivatives are obtained using Eqs (15), (16) and (17). The two simultaneous equations ∂R/∂η = 0 and ∂R/∂β = 0 have no closed-form solution, so numerical methods are applied using the R software.

Methods of weighted least squares
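The weighted least squares estimator (WLSE) (see [31]) minimizes

$$W(\eta, \beta) = \sum_{i=1}^{n}\frac{(n + 1)^2 (n + 2)}{i\,(n - i + 1)}\left[F(x_{(i)}; \eta, \beta) - \frac{i}{n + 1}\right]^2.$$

The partial derivatives with respect to η and β are obtained using Eqs (15), (16) and (17). Solving the two simultaneous equations ∂W/∂η = 0 and ∂W/∂β = 0 gives the estimates of η and β; numerical methods are applied using the R software.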
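The ordinary and weighted least squares criteria differ only in the weights attached to each squared deviation, which the following sketch makes explicit. The function names are hypothetical and the standard forms of both criteria are assumed.

```python
import math


def mktl_cdf(x, eta, beta):
    """MKTL CDF for 0 < x < 1."""
    g = (1.0 - (1.0 - x) ** 2) ** eta
    return -math.expm1(-((g / (1.0 - g)) ** beta))


def ls_objective(eta, beta, data):
    """Sum of squared gaps between F(x_(i)) and the plotting position i/(n+1)."""
    n = len(data)
    return sum((mktl_cdf(x, eta, beta) - i / (n + 1.0)) ** 2
               for i, x in enumerate(sorted(data), start=1))


def wls_objective(eta, beta, data):
    """Same gaps, each weighted by (n+1)^2 (n+2) / (i (n-i+1))."""
    n = len(data)
    return sum((n + 1.0) ** 2 * (n + 2.0) / (i * (n - i + 1.0))
               * (mktl_cdf(x, eta, beta) - i / (n + 1.0)) ** 2
               for i, x in enumerate(sorted(data), start=1))
```

The weights are largest for the smallest and largest order statistics, so the weighted criterion penalizes misfit in the tails more heavily than the unweighted one.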

Methods of left tail Anderson Darling
The left-tail Anderson-Darling estimator (LADE) (see [32]) minimizes

$$L(\eta, \beta) = -\frac{3n}{2} + 2\sum_{i=1}^{n}F(x_{(i)}; \eta, \beta) - \frac{1}{n}\sum_{i=1}^{n}(2i - 1)\log F(x_{(i)}; \eta, \beta).$$

With the help of Eqs (15), (16) and (17), L is differentiated with respect to η and β. Solving the two simultaneous equations ∂L/∂η = 0 and ∂L/∂β = 0 gives the estimates of η and β; numerical methods are applied using the R software.
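The Anderson-Darling, right-tail and left-tail criteria share the same two weighted sums, so they can be computed together. The sketch below assumes the standard forms of the three statistics and uses hypothetical function names; log S(x) is taken directly as −H(x) to avoid cancellation in the upper tail.

```python
import math


def log_cdf_and_sf(x, eta, beta):
    """Return (log F(x), log S(x)) for the MKTL distribution, with S = 1 - F = exp(-H)."""
    g = (1.0 - (1.0 - x) ** 2) ** eta
    cum_hazard = (g / (1.0 - g)) ** beta
    return math.log(-math.expm1(-cum_hazard)), -cum_hazard


def ad_family(eta, beta, data):
    """Return the (AD, right-tail AD, left-tail AD) objective values."""
    xs = sorted(data)
    n = len(xs)
    logs = [log_cdf_and_sf(x, eta, beta) for x in xs]
    sum_f = sum(math.exp(log_f) for log_f, _ in logs)
    # (1/n) * sum (2i - 1) log F(x_(i))
    left = sum((2 * i - 1) * logs[i - 1][0] for i in range(1, n + 1)) / n
    # (1/n) * sum (2i - 1) log S(x_(n+1-i))
    right = sum((2 * i - 1) * logs[n - i][1] for i in range(1, n + 1)) / n
    ad = -n - left - right
    rade = n / 2.0 - 2.0 * sum_f - right
    lade = -1.5 * n + 2.0 * sum_f - left
    return ad, rade, lade
```

With these forms the right-tail and left-tail criteria add up to the full Anderson-Darling criterion, which gives a quick consistency check on any implementation.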

Minimum spacing absolute-log distance
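The minimum spacing absolute-log distance (MSALD) estimator minimizes

$$\sum_{i=1}^{n+1}\left|\log \Lambda_i - \log\frac{1}{n + 1}\right|,$$

where $\Lambda_i = F(x_{(i)}) - F(x_{(i-1)})$, with $F(x_{(0)}) = 0$ and $F(x_{(n+1)}) = 1$. The estimates of η and β are obtained by the same steps as before; numerical methods are applied using the R software.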

Anderson Darling left tail second order
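The Anderson-Darling left-tail second-order (ADLTS) estimator (see [33]) minimizes

$$2\sum_{i=1}^{n}\log F(x_{(i)}; \eta, \beta) + \frac{1}{n}\sum_{i=1}^{n}\frac{2i - 1}{F(x_{(i)}; \eta, \beta)}.$$

Differentiating with respect to η and β and solving the two resulting simultaneous equations gives the estimates of η and β; numerical methods are applied using the R software.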

Percentile estimation
The percentile estimation (PE) method was proposed by [34,35] for the Weibull distribution and has subsequently been applied to other distributions. To estimate η and β, we minimize its function PE(x_i), following the same procedure as for the previous estimation methods.

Minimum spacing square-log distance
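The minimum spacing square-log distance (MSSL) estimator minimizes

$$\sum_{i=1}^{n+1}\left(\log \Lambda_i - \log\frac{1}{n + 1}\right)^2,$$

where $\Lambda_i = F(x_{(i)}) - F(x_{(i-1)})$. The estimates are obtained by repeating the procedure used for the previous estimation methods.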
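The two minimum spacing distance criteria compare each spacing with its ideal value 1/(n + 1) on the log scale and differ only in the distance used. The sketch below assumes their standard forms and uses hypothetical function names.

```python
import math


def mktl_cdf(x, eta, beta):
    """MKTL CDF for 0 < x < 1."""
    g = (1.0 - (1.0 - x) ** 2) ** eta
    return -math.expm1(-((g / (1.0 - g)) ** beta))


def log_spacing_gaps(eta, beta, data):
    """log(Lambda_i) - log(1/(n+1)) for i = 1..n+1, or None if a spacing is not positive."""
    n = len(data)
    cdf_values = [0.0] + [mktl_cdf(x, eta, beta) for x in sorted(data)] + [1.0]
    gaps = []
    for lower, upper in zip(cdf_values, cdf_values[1:]):
        if upper - lower <= 0.0:
            return None
        gaps.append(math.log(upper - lower) + math.log(n + 1.0))
    return gaps


def msald_objective(eta, beta, data):
    """Minimum spacing absolute-log distance criterion."""
    gaps = log_spacing_gaps(eta, beta, data)
    return math.inf if gaps is None else sum(abs(gap) for gap in gaps)


def mssl_objective(eta, beta, data):
    """Minimum spacing square-log distance criterion."""
    gaps = log_spacing_gaps(eta, beta, data)
    return math.inf if gaps is None else sum(gap ** 2 for gap in gaps)
```

Both criteria equal zero only when every spacing is exactly 1/(n + 1). Treating a non-positive spacing as an infinite distance is a choice of this sketch.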

Simulation
In the simulation study, numerical techniques are used to compute the estimates of η and β, and the R software is used to assess the average estimate and mean square error (MSE) of each parameter. The estimation methods in Section 4 are employed for this purpose. Random samples of sizes n = 30, 150, 300, 500 and 800 are drawn from the MKTL distribution, with each sample size replicated 1000 times. The initial values for η and β are set as shown in Table 3. The MSE plots in Figs 11-22 for the twelve estimation approaches, whose results are reported in Tables 4-10, illustrate their performance. Moreover, Table 11 displays the sums and overall ranks of the MSE values across all tables, enabling a comparison of the estimation methods. Examining the tables, graphs and ranks leads to the following observations:
• The MSE decreases as n increases for all estimation techniques, a pattern consistent across all tables.
• As n grows, the mean estimates of the parameters (η, β) tend to converge towards their initial parameter values.
• The overall ranks displayed in Table 11 indicate that Maximum Likelihood Estimation (MLE) emerges as the superior method for parameter estimation.
The descriptive analysis of all the data sets is reported in Table 12.
The maximum likelihood estimates (MLEs) and standard errors (SEs) of the model parameters are computed. To compare the fitted distributions, several criteria are considered, including the Akaike information criterion (AIC), Bayesian information criterion (BIC), corrected AIC (CAIC), Hannan-Quinn information criterion (HQIC) and the Kolmogorov-Smirnov (KS) test.

Modified Kies Topp-Leone quantile regression
In modeling the relationship between an endogenous variable and a set of exogenous variables, identifying an appropriate regression model is paramount for accurate statistical inference. The quantile regression model (QRM) is known to be robust for modeling such relationships, especially when the response variable contains atypical points (outliers). In this section, we propose a new QRM based on the re-parameterization of the density function of the MKTL distribution in terms of its quantile function (qf). For more details on how to formulate a QRM, see [44,45]. Given an endogenous variable Y that follows the MKTL distribution and a quantile parameter τ ∈ (0, 1), we obtain the re-parameterized density by first making η the subject of the qf of the MKTL distribution. Thus,

$$\eta = -\frac{\log\left\{1 + \left[-\log(1 - q)\right]^{-1/\beta}\right\}}{\log\left[1 - (1 - \tau)^2\right]}, \quad q \in (0, 1).$$

The re-parameterized density function in terms of the quantile is therefore given by

$$f(y; \beta, \tau) = \frac{2\beta\eta (1 - y)\left[1 - (1 - y)^2\right]^{\eta\beta - 1}}{\left\{1 - \left[1 - (1 - y)^2\right]^{\eta}\right\}^{\beta + 1}} \exp\left\{-\left[\left(1 - (1 - y)^2\right)^{-\eta} - 1\right]^{-\beta}\right\}, \quad 0 < y < 1,$$

with η replaced by the expression above. When q = 0.10, 0.25, 0.50, 0.75, 0.90, 0.95 and 0.99, the density functions of the 10th, 25th, 50th, 75th, 90th, 95th and 99th percentiles are obtained, respectively. The MKTL QRM is obtained by adopting a monotonically increasing and twice differentiable link function to define the relationship between the exogenous variables and the conditional quantiles. Thus, we have

$$h(\tau_i) = x_i'\alpha,$$

where h(·) is the link function, $\tau_i$ is the i-th quantile parameter, $\alpha = (\alpha_0, \alpha_1, \ldots, \alpha_k)'$ is the vector of parameters to be estimated and $x_i' = (1, x_{i1}, x_{i2}, \ldots, x_{ik})$ is the known i-th vector of exogenous variables. The MKTL median regression is obtained when q = 0.50. The logit link function is used in this study to relate the exogenous variables to the conditional quantiles. Thus,

$$\tau_i = \frac{\exp(x_i'\alpha)}{1 + \exp(x_i'\alpha)}, \quad i = 1, 2, \ldots, n.$$

Maximum likelihood estimation is adopted to estimate the parameters of the QRM, and the log-likelihood function for a sample of size n is given by

$$\ell(\beta, \alpha) = \sum_{i=1}^{n}\log f(y_i; \beta, \tau_i).$$

The estimates of the QRM parameters are obtained by maximizing the log-likelihood function.
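The re-parameterization and the resulting log-likelihood can be sketched as follows. The function names are hypothetical, and the expression for η in terms of τ, β and q is the one obtained by inverting the MKTL quantile function. Maximizing `qrm_loglik` over α and β with any numerical optimizer then gives the QRM estimates.

```python
import math


def log_a(p, beta):
    """log(1 + [-log(1 - p)]^(-1/beta)), evaluated without overflow, for 0 < p < 1."""
    t = -math.log(-math.log1p(-p)) / beta
    return max(t, 0.0) + math.log1p(math.exp(-abs(t)))


def eta_from_quantile(tau, beta, q):
    """Value of eta for which the q-th quantile of MKTL(eta, beta) equals tau."""
    return -log_a(q, beta) / math.log1p(-(1.0 - tau) ** 2)


def mktl_logpdf(y, eta, beta):
    """Log of the MKTL density at 0 < y < 1."""
    log_t = math.log1p(-(1.0 - y) ** 2)
    log_one_minus_g = math.log(-math.expm1(eta * log_t))
    cum_hazard = math.exp(beta * (eta * log_t - log_one_minus_g))
    return (math.log(2.0 * beta * eta) + math.log(1.0 - y)
            + (eta * beta - 1.0) * log_t
            - (beta + 1.0) * log_one_minus_g - cum_hazard)


def inverse_logit(z):
    """Inverse of the logit link; maps the linear predictor into (0, 1)."""
    return 1.0 / (1.0 + math.exp(-z))


def qrm_loglik(alpha, beta, q, rows, ys):
    """QRM log-likelihood; each row holds (x_i1, ..., x_ik) without the intercept."""
    total = 0.0
    for row, y in zip(rows, ys):
        tau = inverse_logit(alpha[0] + sum(a * x for a, x in zip(alpha[1:], row)))
        total += mktl_logpdf(y, eta_from_quantile(tau, beta, q), beta)
    return total
```

For instance, `qrm_loglik([0.1, 0.7, 0.4], 0.5, 0.5, rows, ys)` evaluates the median-regression log-likelihood at one candidate parameter vector.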

Residual analysis
Assessing the suitability of the QRM before using it for any inference is vital. Model assessment (or diagnostics) can be done by examining the residuals: how well the residuals behave determines the adequacy of the model for the given data. We employ the Cox-Snell residuals (CSR) (see [46]) to determine how adequate the QRM is. The CSR are given by

$$r_i = -\log\left(1 - F(y_i; \beta, \alpha)\right), \quad i = 1, 2, \ldots, n,$$

where F(y_i; β, α) is the re-parameterized cumulative distribution function of the MKTL distribution. If the QRM provides an adequate fit to the given data, the CSR are expected to follow the standard exponential distribution.
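Since 1 − F(y) = exp(−H(y)) for the MKTL distribution, the Cox-Snell residual of each observation is simply its cumulative hazard under the fitted model. The sketch below computes the residuals and the points of a P-P plot against the standard exponential distribution. The function names are hypothetical, and the plotting position (i − 0.5)/n is an assumption of this sketch rather than a choice reported in the paper.

```python
import math


def log_a(p, beta):
    """log(1 + [-log(1 - p)]^(-1/beta)), evaluated without overflow, for 0 < p < 1."""
    t = -math.log(-math.log1p(-p)) / beta
    return max(t, 0.0) + math.log1p(math.exp(-abs(t)))


def cox_snell_residuals(alpha, beta, q, rows, ys):
    """r_i = -log(1 - F(y_i)) = [G(y_i) / (1 - G(y_i))]^beta under the fitted QRM."""
    residuals = []
    for row, y in zip(rows, ys):
        linear = alpha[0] + sum(a * x for a, x in zip(alpha[1:], row))
        tau = 1.0 / (1.0 + math.exp(-linear))                    # logit link
        eta = -log_a(q, beta) / math.log1p(-(1.0 - tau) ** 2)    # eta implied by tau
        g = (1.0 - (1.0 - y) ** 2) ** eta
        residuals.append((g / (1.0 - g)) ** beta)
    return residuals


def exponential_pp_points(residuals):
    """Pairs (empirical probability, Exp(1) probability) for a P-P plot."""
    ordered = sorted(residuals)
    n = len(ordered)
    return [((i - 0.5) / n, -math.expm1(-r)) for i, r in enumerate(ordered, start=1)]
```

Points lying close to the diagonal indicate that the residuals behave like a standard exponential sample.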

Simulation experiment for MKTL QRM
The performance of the maximum likelihood estimation procedure is appraised in this subsection to determine how well it estimates the parameters of the QRM. Monte Carlo simulation experiments are implemented using 1000 replications with three different scenarios of parameter combinations and sample sizes n = 50, 100, 250, 450, 850 and 1000. We examine how well the estimates behave by computing the average estimate (AE), absolute bias (AB), root mean square error (RMSE), 95% confidence interval (CI) coverage probability (CP) and the average width of the CI (AWCI). In addition, we estimate the lower confidence limit (LCL) and upper confidence limit (UCL) of the parameter estimates. The three sets of parameters used in the simulation are: I: α0 = 0.4, α1 = −0.6, α2 = 1.2, β = 0.1; II: α0 = 0.1, α1 = 0.7, α2 = 0.4, β = 0.5; and III: α0 = −0.2, α1 = 1.8, α2 = 0.2, β = 4.5. Two exogenous variables are used in the simulation: $x_{i1}$ is generated from a standard uniform distribution and $x_{i2}$ is a binary variable generated from a Bernoulli distribution with probability 0.5. The exogenous variables are held fixed during the simulations. The observations for the response variable $y_i$ are obtained via the inversion method. Thus,

$$y_i = 1 - \sqrt{1 - \left\{1 + \left[-\log(1 - u_i)\right]^{-1/\beta}\right\}^{-1/\eta_i}}, \qquad \eta_i = -\frac{\log\left\{1 + \left[-\log(1 - q)\right]^{-1/\beta}\right\}}{\log\left[1 - (1 - \tau_i)^2\right]},$$

where $u_i$ are observations from the standard uniform distribution. We use q = 0.5 to perform the simulation. The regression model used in the simulation experiments is

$$\mathrm{logit}(\tau_i) = \alpha_0 + \alpha_1 x_{i1} + \alpha_2 x_{i2}.$$

The results in Tables 19-21 show that the AEs are quite close to the true values and get closer as n → ∞, the ABs and RMSEs decrease as n → ∞, the CPs are quite high and close to the nominal value of 0.95, and the AWCI gets narrower as n increases. The CI limits (LCL and UCL) get tighter as the sample size becomes large. This confirms that the parameter estimates are well behaved and that the estimation approach adopted is able to estimate the parameters well.
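The data-generating step of this experiment can be sketched as below for a single data set. The function names are hypothetical, and the sketch draws fresh covariates on every call, whereas the study holds the covariates fixed across replications; generating them once and reusing them would match that design.

```python
import math
import random


def log_a(p, beta):
    """log(1 + [-log(1 - p)]^(-1/beta)), evaluated without overflow, for 0 < p < 1."""
    t = -math.log(-math.log1p(-p)) / beta
    return max(t, 0.0) + math.log1p(math.exp(-abs(t)))


def simulate_qrm(n, alpha, beta, q=0.5, seed=None):
    """Draw (covariates, responses, conditional quantiles) from the MKTL QRM by inversion."""
    rng = random.Random(seed)
    rows, ys, taus = [], [], []
    for _ in range(n):
        x1 = rng.random()                       # standard uniform covariate
        x2 = 1 if rng.random() < 0.5 else 0     # Bernoulli(0.5) covariate
        linear = alpha[0] + alpha[1] * x1 + alpha[2] * x2
        tau = 1.0 / (1.0 + math.exp(-linear))   # logit link for the q-th quantile
        eta = -log_a(q, beta) / math.log1p(-(1.0 - tau) ** 2)
        u = rng.random()
        while u == 0.0:                         # keep u strictly inside (0, 1)
            u = rng.random()
        ys.append(1.0 - math.sqrt(-math.expm1(-log_a(u, beta) / eta)))
        rows.append((x1, x2))
        taus.append(tau)
    return rows, ys, taus
```

With q = 0.5, each τ_i is the conditional median of y_i, so roughly half of the simulated responses should fall at or below their τ_i.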

MKTL QRM application
The applicability of the MKTL QRM is illustrated in this section by exploring the effect of labour market insecurity (LMI) and homicide rate (HR) on educational attainment value (EAV) in OECD countries. A detailed description of the data can be found in [47]. Mazucheli et al. [47] fitted the unit generalized half normal (UGHN) QRM to the data and identified the 0.1 conditional quantile as the best, with AIC = −62.8264 and BIC = −56.2761. Also, [48] studied the relationship between the EAV, LMI and HR using the beta regression (AIC = −59.6000, BIC = −53.0490) and log-weighted exponential mean regression (AIC = −65.2580, …) models. The MKTL QRM is adopted here to investigate the relationship. Table 22 conveys the parameter estimates, standard errors, p-values and information criteria for the different conditional quantiles. The estimated parameters are all significant, and the 0.01 conditional quantile appears to be the best according to the reported information criteria. Hence, the LMI and HR have a significant effect on the EAV. The fitted conditional quantiles in this study outperform the models fitted in [47,48]. The diagnostic checks of the model residuals for the various conditional quantiles are carried out using the CSR and their probability-probability (P-P) plots. The estimate of α0 increases as the quantile level increases, while the estimates of α1 and α2 approach zero as the quantile level increases.

Concluding remarks
In this paper, we introduce another extension of the Topp-Leone distribution based on the modified Kies family of distributions. Some statistical properties of the new distribution are derived, and twelve estimation methods are used to estimate its parameters. The findings of the simulation experiments indicate that the maximum likelihood method is the best for estimating the parameters of the distribution. The practicality of the new distribution is illustrated using three data sets, and the results suggest that the MKTL distribution provides the best fit when compared with other competing distributions. The performance of the proposed quantile regression is assessed by exploring the effects of LMI and HR on the EAV of OECD countries. The developed regression model provided a good fit to the data and proved to be better than the UGHN QRM, beta and log-weighted exponential mean regression models.

Figs 7-10 show the 3D plots of the mean, variance, CS, and CK for the MKTL distribution.

Fig 24. Some basic non-parametric plots for the second dataset.

Fig 34. Estimated PP plots of data set 3. https://doi.org/10.1371/journal.pone.0307391.g034

The P-P plots in Fig 35 affirm the adequacy of the model, as the CSR cluster along the diagonals. The rate of change of the estimated coefficients across the various quantiles is shown in Fig 36.